Abstract: We develop a study of ignorability and conditions thereof for
likelihood inference in the framework of stochastic processes. We define a
general coarsening model for processes which includes discrete-time
observations as well as censored continuous-time observations and applies to
continuous state-space processes as well as counting processes. To prepare the
work we recall formulas for manipulating marginal and conditional likelihood
ratios (which can apply to stochastic processes). Ignorability is defined in
terms of local equality of two likelihood ratios. We give general conditions
of ignorability and then dynamical conditions which are more interpretable.
We illustrate the use of the dynamical conditions of ignorability in problems
of censoring, missing data and joint modelling.

Keywords: coarsening, censoring, counting processes, ignorability, likeli- 
hood, missing data, Radon-Nikodym derivatives, stochastic processes. 



1 Introduction 

Incomplete data are very common in statistics: when the mechanism leading 
to incomplete data (m.l.i.d.) is fixed a relatively simple likelihood can be 
written in general. Often the m.l.i.d. cannot be considered as fixed and
the question arises whether it can still be ignored. Rubin (1976) introduced 
the concept of ignorability for the simplest case in which the observation 
is a sequence of random variables and some of them are missing; he gave 
conditions under which inference based on the assumption of fixed m.l.i.d. 
was valid, even in the case when in fact it was not fixed. He established a 
typology of cases of missing data and showed in particular that the m.l.i.d. 
was ignorable for likelihood inference in the case of missing at random (MAR) 
observations. 

In the framework of survival analysis the most frequent cases of incomplete
data are right-censoring (Kaplan and Meier, 1958; Cox, 1972) and interval
censoring (Peto, 1973). Conditions under which the conventional likelihood
for right-censored survival data is valid have been studied (Lagakos,
1979; Kalbfleisch and Prentice, 1980). Andersen et al. (1993) studied the
concept of independent censoring in the counting process framework. Heitjan
and Rubin (1991) also studied some less conventional incomplete data cases
which they called "coarsening". This topic was also studied by Jacobsen
and Keiding (1995), Gill et al. (1997) and Nielsen (2000). The problem
of non-ignorable m.l.i.d. has prompted the development of joint models, in
which the m.l.i.d. is included, for instance in a model proposed by Diggle
and Kenward (1994); see Thiebaut et al. (2005) for a recent example.

The aim of this paper is to study ignorability in the context of stochastic
processes: these processes may be counting processes but also continuous
state-space processes, such as diffusion processes. To give a rigorous
treatment of that topic, we will need to rely on basic probability tools. First we
will speak in terms of the likelihood ratio, which is defined as a Radon-Nikodym
derivative: this makes it possible to manipulate likelihood ratios for the observation of
stochastic processes (for a review see Barndorff-Nielsen and Sorensen, 1994).
Local equalities of $\sigma$-fields and of random variables will play an important
role in the very definition of ignorability and in the proofs, and we recall a
probability result described in Kallenberg (2001):

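Lemma 1 Let the $\sigma$-fields $\mathcal{F}, \mathcal{G} \subset \mathcal{A}$ and functions $\xi, \eta \in L^1$ be such that
$A \cap \mathcal{F} = A \cap \mathcal{G}$ and $\xi = \eta$ a.s. on some set $A \in \mathcal{F} \cap \mathcal{G}$. Then $E[\xi|\mathcal{F}] = E[\eta|\mathcal{G}]$
a.s. on $A$.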

Also, results on the likelihood of point processes due to Jacod (1975) will 
play a key role in several proofs. 

We begin in section 2 by recalling the general definition of the likelihood
ratio and of marginal and conditional likelihoods; a set of useful formulae
is given. In section 3 a coarsening model for stochastic processes is given.
Then in section 4 we present a new formulation of the incomplete observation
problem based on $\sigma$-fields and we give a definition of ignorability. In
section 5 we give general conditions of ignorability and in section 6 a
dynamical condition which is more interpretable and usable than the general
ones in some contexts. Finally section 7 illustrates the use of the theory in
survival models, longitudinal data and joint models, and section 8 is a short
conclusion.



2 Full, marginal and conditional likelihood 

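Consider a measurable space $(\Omega, \mathcal{F})$ and a family of measures $\{P_\theta\}_{\theta \in \Theta}$
absolutely continuous relatively to a dominant measure $P_{\theta_0}$. For $\mathcal{X}$ a sub-$\sigma$-field
of $\mathcal{F}$ the likelihood ratio on $\mathcal{X}$ is defined by:

$$\mathcal{L}_{\mathcal{X}}^{\theta/\theta_0} = \left.\frac{dP_\theta}{dP_{\theta_0}}\right|_{\mathcal{X}}$$

where $\left.\frac{dP_\theta}{dP_{\theta_0}}\right|_{\mathcal{X}}$ is the Radon-Nikodym derivative of $P_\theta$ relatively to $P_{\theta_0}$.
Recall that it is the $\mathcal{X}$-measurable random variable such that
$P_\theta(F) = \int_F \left.\frac{dP_\theta}{dP_{\theta_0}}\right|_{\mathcal{X}} dP_{\theta_0}$, $F \in \mathcal{X}$. If $\mathcal{X} \subset \mathcal{Y} \subset \mathcal{F}$ we have the fundamental formula
(Williams, 1991):

$$\mathcal{L}_{\mathcal{X}}^{\theta/\theta_0} = E_{\theta_0}[\mathcal{L}_{\mathcal{Y}}^{\theta/\theta_0} | \mathcal{X}].$$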

Note that because conditional expectations and likelihood ratios are defined
a.s., all the equalities involving them are to be understood as a.s., even if
this is not specified, for the sake of notational simplicity.
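The relation between the likelihood ratio on a coarse $\sigma$-field and the conditional expectation of the likelihood ratio on a finer one, $\mathcal{L}_{\mathcal{X}} = E_{\theta_0}[\mathcal{L}_{\mathcal{Y}} | \mathcal{X}]$ for $\mathcal{X} \subset \mathcal{Y}$, can be checked by hand on a finite sample space. The following is only an illustrative sketch: the four-point space, the two measures `P0` and `P1`, and all function names are assumptions made for the example, not objects from the paper.

```python
from fractions import Fraction

# Illustrative finite sample space (an assumption for this sketch).
OMEGA = ["a", "b", "c", "d"]

# Reference measure P0 and alternative measure P1, both charging every point.
P0 = {"a": Fraction(1, 4), "b": Fraction(1, 4), "c": Fraction(1, 4), "d": Fraction(1, 4)}
P1 = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 8), "d": Fraction(1, 8)}

# The fine sigma-field is generated by the singletons; the coarse one
# is generated by the partition {a, b}, {c, d}.
PARTITION = [("a", "b"), ("c", "d")]


def block_of(w):
    """Return the block of the coarse partition that contains w."""
    return next(b for b in PARTITION if w in b)


def ratio_fine(w):
    """Likelihood ratio dP1/dP0 on the fine sigma-field."""
    return P1[w] / P0[w]


def ratio_coarse(w):
    """Likelihood ratio dP1/dP0 restricted to the coarse sigma-field."""
    block = block_of(w)
    return sum(P1[v] for v in block) / sum(P0[v] for v in block)


def cond_exp_under_p0(f, w):
    """Conditional expectation of f given the coarse sigma-field, under P0."""
    block = block_of(w)
    mass = sum(P0[v] for v in block)
    return sum(f(v) * P0[v] for v in block) / mass


# The coarse ratio coincides with the P0-conditional expectation of the fine one.
formula_holds = all(ratio_coarse(w) == cond_exp_under_p0(ratio_fine, w) for w in OMEGA)
```

Exact rational arithmetic is used so that the equality can be tested without rounding tolerance.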

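When $\mathcal{F}$ is generated by two random elements $X$ and $Y$, denote
by $\mathcal{X}$ and $\mathcal{Y}$ the $\sigma$-fields they generate respectively; thus $\mathcal{F} = \mathcal{X} \vee \mathcal{Y}$, and we
will write $\mathcal{L}_{\mathcal{F}}^{\theta/\theta_0} = \mathcal{L}_{X,Y}^{\theta/\theta_0}$. The likelihoods $\mathcal{L}_{X}^{\theta/\theta_0}$ and $\mathcal{L}_{Y}^{\theta/\theta_0}$ are called
marginal likelihoods, and are linked to the full likelihood by the conditional
expectations: $\mathcal{L}_{X}^{\theta/\theta_0} = E_{\theta_0}[\mathcal{L}_{X,Y}^{\theta/\theta_0}|\mathcal{X}]$ and $\mathcal{L}_{Y}^{\theta/\theta_0} = E_{\theta_0}[\mathcal{L}_{X,Y}^{\theta/\theta_0}|\mathcal{Y}]$, as follows from the
fundamental formula.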

Conditional likelihoods can also be defined (see Hoffmann-Jørgensen,
1994); for brevity we do not recall the definition. The conditional
likelihood ratio of $Y$ given $X$ will be denoted $\mathcal{L}_{Y|X}^{\theta/\theta_0}$. The following properties will
be used in this paper:

Note that ii) is the generalization of the main property of the likelihood
ratio to conditional expectations; thus $\mathcal{L}_{Y|X}^{\theta/\theta_0}$ deserves its name of conditional
likelihood ratio; it also implies $E_{\theta_0}[\mathcal{L}_{Y|X}^{\theta/\theta_0}|\mathcal{X}] = 1$. Note also that under the
assumption of a family of equivalent measures, all the likelihoods are strictly
positive a.s.

3 A general coarsening model for processes

We define a General Coarsening Model for Processes (GCMP). In many real
studies, in particular in epidemiology, we would have to consider a sample of
$n$ independent "subjects", to each of whom a process $X^i = (X^i_t)_{t \ge 0}$
would be associated. Since the likelihood would be the product of the
individual likelihoods, it is sufficient to consider only one process. We first
consider a process $X = (X_t)$ where $X_t$ takes values in $\mathbb{R}$, then we will extend
the model to a multivariate process. The main objective is to describe
observation schemes for processes in continuous time $t$, but $t$ may also be discrete
so that the results can be applied to finite collections of random variables.
We shall consider a response indicator process $R = (R_t)$ taking value 1 at $t$
if $X_t$ is observed and 0 otherwise; this is a generalization of the response
indicator variable introduced by Rubin (1976). This unifies different concepts of
censoring and observation of longitudinal data. For instance particular cases
are:

i) right-censored survival data: case where $X$ is a 0-1 counting process and
$R_t = 1$ if $t < C$, 0 otherwise, where $C$ is a censoring variable;

ii) left-censored survival data: case where $X$ is a 0-1 counting process and
$R_t = 1$ if $t > C$, 0 otherwise;

iii) interval-censored survival data ($X$ is a counting process) or repeated
measurements ($X$ has a continuous state space): case where $R_t = 1$ if
$t \in \{V_1, V_2, \ldots, V_m\}$, 0 otherwise.

Note that $C$ in cases i) and ii), and $V_j$, $j = 1, \ldots, m$ in iii) are random
variables. Cases i) and ii) illustrate a situation where $R$ is either right- or
left-continuous at each jump time and correspond to observation in continuous
time on some windows (see Figure 1); case iii) corresponds to observations in
discrete time. In the latter case $R_t = 1$ only on a finite or denumerable set
of the half line $[0, +\infty[$.

Figure 1: right-censoring

Figure 2: Continuous monitoring followed by discrete-time visits

The GCMP can represent a large number of
non-standard observation schemes. For instance the subjects can be observed
on windows separated by periods where no observations are taken. In most
applications the process $X$ will be either observed continuously on some
periods or observed only at discrete time points, but the GCMP can represent
a mixture of the two types of observation for the same process: for instance a
subject could be observed in continuous time while in hospital and at
discrete times after leaving hospital (see Figure 2). Also the process $X$ is
not necessarily a 0-1 counting process but may be for instance a more general
counting process, allowing recurrent events, or a process with continuous
state-space like a diffusion process.
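The observation schemes described above can be sketched on a discrete time grid. This is only an illustration under stated assumptions: the integer grid, the event time, the censoring time, the visit times and every function name below are hypothetical choices, and the observed data are represented by the pairs $(R_t, R_t X_t)$.

```python
def right_censoring(t, c):
    """Case i): R_t = 1 if t < c, 0 otherwise."""
    return 1 if t < c else 0


def left_censoring(t, c):
    """Case ii): R_t = 1 if t > c, 0 otherwise."""
    return 1 if t > c else 0


def discrete_visits(t, visits):
    """Case iii): R_t = 1 if t is one of the visit times, 0 otherwise."""
    return 1 if t in visits else 0


def hospital_then_visits(t, discharge, visits):
    """Mixed scheme: continuous monitoring before discharge, visits afterwards."""
    return 1 if (t < discharge or t in visits) else 0


def counting_process(event_time):
    """0-1 counting process X_t = 1 if the event has occurred by time t."""
    return lambda t: 1 if t >= event_time else 0


def observe(times, x, r):
    """Return the pairs (R_t, R_t * X_t) along the grid."""
    return [(r(t), r(t) * x(t)) for t in times]


times = [0, 1, 2, 3, 4, 5]          # hypothetical integer time grid
x = counting_process(event_time=3)  # hypothetical event at time 3

# Right-censoring at c = 2: the event at time 3 is never seen.
censored = observe(times, x, lambda t: right_censoring(t, c=2))

# Visits at times 1 and 4: the event is only known to lie between the visits.
visited = observe(times, x, lambda t: discrete_visits(t, visits={1, 4}))
```

In the second scheme the process is seen at 0 at the first visit and at 1 at the second, which is the interval-censored situation of case iii).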

That the GCMP includes censoring or coarsening models for random
variables is obvious from the fact that to each random variable $Y$ we can
associate a counting process $X_t = 1_{\{Y \le t\}}$. We can define a coarsening model
for a random variable $Y$ by a partition $\mathcal{P}$ defined by intervals $A_j$ and indicators
$U_j$ which take values 1 or 0: if $U_j = 1$, $Y$ is exactly observed on $\{Y \in A_j\}$;
if $U_j = 0$ it is only observed that $Y$ falls in $A_j$. This is equivalent to the
GCMP for $X_t$ defined by $R_t = \sum_j U_j 1_{A_j}(t)$. In order to construct a random
mechanism we can take a random partition, for instance determined by a set
of random variables.
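The link between the partition-based coarsening of a random variable and the response indicator $R_t = \sum_j U_j 1_{A_j}(t)$ can be illustrated as follows. This is a sketch only: the half-open convention $A_j = [e_j, e_{j+1}[$, the particular edges and indicators, and the function names are assumptions made for the example.

```python
from bisect import bisect_right


def interval_index(t, edges):
    """Index j such that t lies in A_j = [edges[j], edges[j+1]), or None."""
    j = bisect_right(edges, t) - 1
    if j < 0 or j >= len(edges) - 1:
        return None
    return j


def response_indicator(t, edges, u):
    """R_t = sum_j U_j 1_{A_j}(t); equals 0 outside the partition."""
    j = interval_index(t, edges)
    return 0 if j is None else u[j]


def coarsened_observation(y, edges, u):
    """Exact value of y if U_j = 1 on its interval, otherwise the interval itself."""
    j = interval_index(y, edges)
    if j is None:
        raise ValueError("y falls outside the partition")
    if u[j] == 1:
        return ("exact", y)
    return ("interval", (edges[j], edges[j + 1]))


edges = [0, 2, 5, 10]  # hypothetical partition [0,2), [2,5), [5,10)
u = [1, 0, 1]          # hypothetical indicators U_j

seen_exactly = coarsened_observation(1.5, edges, u)
seen_coarsely = coarsened_observation(3.2, edges, u)
```

Taking `edges` or `u` random would give a random coarsening mechanism of the kind mentioned in the text.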

The model can be extended to the case of multivariate processes, as may
be required by the observation of several processes on the same "subject".
So, we may consider that $X_t$ takes values in $\mathbb{R}^d$, $d > 1$, and we may also
consider a multidimensional response process $(R_t)$.

Still another extension is to allow that if $R_t = 1$, $X_t$ is not observed
completely, but according to a fixed m.l.i.d.: we shall refer to that case
as "vertical coarsening". This would apply to left-censored observation of a
biological marker due to a detection limit (such as HIV-RNA), as exemplified
in section 7.2.

Remark. The $(R_t)$ process is a generalization of the response indicator
variables of Rubin (1976). It is different from the filtering process proposed
in Andersen et al. (1993) (section III.4): for instance neither left-censoring
nor discrete observation times can be treated with the filtering process of
Andersen et al. See also Arjas, Haara and Norros (1992) and Arjas and Haara
(1992) for a detailed treatment of the filtering problem in the framework of
marked point processes. The $(R_t)$ process can also be considered as a special
case of the auxiliary random variable $G$ used by Heitjan and Rubin (1991)
or Jacobsen and Keiding (1995) if we interpret this random variable as a
random element in a Skorohod space.

4 Ignorability 

4.1 The sigma-field representation for incomplete data

A model for the random element $X$ is a family of measures $\{P_\theta\}_{\theta \in \Theta}$ on a
$\sigma$-field $\mathcal{X}$ generated by $X$ (for us, $X$ will be a stochastic process: $X_t$ takes values
in $\mathbb{R}^d$ while the path of $X$ is an element of a Skorohod space). We will assume
that the measures in the family are equivalent and take $P_{\theta_0}$ as the reference
measure. If we observe $X$ we will use the likelihood $\mathcal{L}_{\mathcal{X}}^{\theta/\theta_0}$ for inference about
$\theta$. We will represent the observed events by a $\sigma$-field $\mathcal{O}$. A general definition
of incomplete data is: $\mathcal{X} \not\subset \mathcal{O}$. A simple case of incomplete data is when
only a sub-$\sigma$-field of $\mathcal{X}$ has been observed: $\mathcal{O} \subset \mathcal{X}$. A particular case occurs
when the m.l.i.d. can be represented by a GCMP with $R$ a deterministic
function. We shall denote by $\{R = r\}$ the event $\{R_t = r_t, t \ge 0\}$ where $r_t$
is a particular path (an element of the Skorohod space for instance). If $R$ is
deterministic there is a value (a path) $r$ such that $\{R = r\} = \Omega$; for instance
$R_t = 1$ for $t < c$ and $R_t = 0$ for $t > c$, where $c$ is fixed. In such a case we
have $\mathcal{O} = \mathcal{X}^r = \sigma(X_t, t \ge 0 : r_t = 1) = \sigma(r_t X_t, t \ge 0)$. In that case we have
$\mathcal{L}_{\mathcal{O}}^{\theta/\theta_0} = \mathcal{L}_{\mathcal{X}^r}^{\theta/\theta_0}$, which is in general relatively easy to compute. For instance in the
right-censored case above, if $X$ is a counting process, Jacod's formula (Jacod,
1975) can be used to obtain the likelihood; in the more particular case where
$X$ is a 0-1 counting process with only one jump time $T$, the likelihood takes
the form

$$\left(\frac{\alpha(T)}{\alpha_0(T)}\right)^{\delta} \exp\left(-\int_0^{T \wedge c} (\alpha(u) - \alpha_0(u))\,du\right),$$

where $\delta = 1_{\{T < c\}}$ and $\alpha(\cdot)$ (resp. $\alpha_0(\cdot)$) are the
risk functions under $P_\theta$ (resp. $P_{\theta_0}$). Feigin (1976) has given the likelihood for
a diffusion process observed in continuous time. If a process is observed at
fixed times $v_1, \ldots, v_m$ the likelihood can be computed as the likelihood for
observation of the vector of random variables $X_{v_1}, \ldots, X_{v_m}$; if $X$ is a 0-1
counting process this case has been denoted "interval censoring" and the
likelihood is simple to compute (see Peto, 1973; Alioum and Commenges,
1996); if $X$ is a Gaussian process, the vector $X_{v_1}, \ldots, X_{v_m}$ has a normal
distribution which makes the likelihood easy to compute.
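As a numerical illustration of the right-censored case with a fixed censoring time, the sketch below evaluates the likelihood ratio for a single event time under the simplifying assumption of constant risk functions, so that the ratio reduces to $(\alpha/\alpha_0)^{\delta} \exp(-(\alpha - \alpha_0)\min(T, c))$. The constant hazards, the parameter values and the function names are assumptions of the example. It also checks, by a midpoint-rule integration, that the ratio has expectation 1 under the reference measure, as any likelihood ratio must.

```python
import math


def censored_likelihood_ratio(t_event, c, alpha, alpha0):
    """Likelihood ratio for one event time right-censored at the fixed time c.

    Assumes constant hazards alpha (model) and alpha0 (reference measure).
    """
    delta = 1 if t_event < c else 0   # 1 if the event is observed before c
    follow_up = min(t_event, c)       # time during which the process is observed
    return (alpha / alpha0) ** delta * math.exp(-(alpha - alpha0) * follow_up)


def expected_ratio_under_reference(c, alpha, alpha0, n_steps=10000):
    """Expectation of the ratio when T is exponential with hazard alpha0."""
    h = c / n_steps
    total = 0.0
    for k in range(n_steps):
        t = (k + 0.5) * h                         # midpoint of the k-th cell
        density0 = alpha0 * math.exp(-alpha0 * t)  # density of T under alpha0
        total += censored_likelihood_ratio(t, c, alpha, alpha0) * density0 * h
    # Contribution of the censored outcomes {T >= c}.
    survival0 = math.exp(-alpha0 * c)
    total += censored_likelihood_ratio(c, c, alpha, alpha0) * survival0
    return total


expectation = expected_ratio_under_reference(c=2.0, alpha=1.5, alpha0=1.0)
```

With other (non-constant) risk functions the exponent would involve the integrated difference of the hazards up to $\min(T, c)$ instead of the product used here.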

If $R$ is not fixed, the above definition of $\mathcal{O}$ is meaningless; we must
include $R$ in the description of the problem. We shall consider a larger $\sigma$-field
$\mathcal{F} = \mathcal{X} \vee \mathcal{R}$, where $\mathcal{R}$ is the $\sigma$-field generated by $R$; for right- or left-continuous
processes that is: $\mathcal{R} = \sigma(R_t, t \ge 0)$; if $R$ takes value 1 at only a finite (or
denumerable) number of times (corresponding to the discrete observation case)
we can take $\mathcal{R}$ as generated by the counting process counting the number of
observation times. We consider that $R$ is observed (see Remark 2) so that a
representation of $\mathcal{O}$ is $\mathcal{O} = \sigma(R_t X_t, R_t, t \ge 0)$. In section 6, which develops a
dynamical approach to the problem, we shall define adequate filtrations; for
instance if $R$ is cadlag (right-continuous with left-hand limits) the observed
filtration $(\mathcal{O}_t)$ will be the family of $\sigma$-fields $\mathcal{O}_t = \sigma(R_u X_u, R_u, 0 \le u \le t)$.
We have of course $\mathcal{O} \subset \mathcal{F}$; generally we have $\mathcal{X} \not\subset \mathcal{O}$ (incomplete data) and
$\mathcal{O} \ne \mathcal{X}^r$ (the observation is not a predetermined subset of values of $X$).
Remark 1. We might think that we could define an interesting $\sigma$-field by
$\sigma(R_t X_t, t \ge 0)$ which could take the role of the notation $X_{obs}$ used in most
of the literature on missing data (for instance Kenward and Molenberghs,
1998); however if $X_t \ne 0$ for all $t$, the latter $\sigma$-field is equal to $\mathcal{O}$. When $R$
is random it is not possible to disentangle the observed part of $X$ from $R$;
only the realized value of $R$ effects a partition between $X_{obs}$ and $X_{mis}$. This
is in fact the meaning of $\mathcal{X}^r$, which is the observed part of $X$ when $R = r$.
Remark 2. It is natural to say that $R$ is observed: for each $t$ we know
whether we observe $X_t$ or not. There is however an important case where
this natural assumption does not hold: in survival analysis, $X$ is a 0-1
counting process, so after a jump has been observed there is no need for
observation anymore; so we may not know whether we would have observed
the process if it had been necessary. The simplest way out of this
problem is to put $R_t = 1$ if $X_t$ is known. More generally assume that there
is an absorbing state $a$ such that if $X_t = a$, $X_{t+u} = a$ for $u > 0$; define
the $(\mathcal{O}_t)$-stopping time $T = \inf\{t : X_t = a \text{ and } R_t = 1\}$. By convention put
$R_{T+u} = 1$, $u > 0$. This part of the law of $R$ is anyway unidentifiable. In the
remainder of the paper we will consider that $R$ is observed.

4.2 Model and notations

From now on, a model for the random element $(X, R)$ is a family of measures
$\{P_{(\theta,\psi)}\}_{(\theta,\psi) \in \Theta \times \Psi}$ on a measurable space $(\Omega, \mathcal{F})$. $X$ (resp. $R$) takes values in a
measurable space $(\Xi, \xi)$ (resp. $(\Gamma, \rho)$). For us $X$ and $R$ will be $d$-dimensional
cadlag stochastic processes, so $(\Xi, \xi)$ and $(\Gamma, \rho)$ are Skorohod spaces endowed
with their Borel $\sigma$-fields. The parameter spaces $\Theta$ and $\Psi$ need not be finite
dimensional. We will assume that the measures in the family are equivalent
and take $P_{(\theta_0,\psi_0)}$ as reference measure. $P_\theta$ is the restriction of $P_{(\theta,\psi)}$ to $\mathcal{X}$:
that is, the marginal probability of $X$ does not depend on $\psi$. The additional
parameter $\psi$ will be considered as a nuisance parameter. We assume
implicitly a "Non-Informativeness" assumption in the coarsening mechanism,
which is:

$$P_{(\theta_1,\psi)}(A|\mathcal{X}) = P_{(\theta_2,\psi)}(A|\mathcal{X}), \text{ a.s.}, \quad A \in \mathcal{R}, \text{ for all } \theta_1, \theta_2, \psi. \quad (1)$$

In words, the conditional probability of $R$ given $X$ does not depend on $\theta$.
This has an important consequence in terms of likelihood ratio. The latter
assumption can also be written as: $E_{(\theta_1,\psi)}(1_A|\mathcal{X}) = E_{(\theta_2,\psi)}(1_A|\mathcal{X})$, $A \in \mathcal{R}$,
which, remembering property ii) of the conditional likelihood, is equivalent
to $E_{(\theta_0,\psi_0)}(1_A \mathcal{L}_{R|X}^{(\theta_1,\psi)/(\theta_0,\psi_0)}|\mathcal{X}) = E_{(\theta_0,\psi_0)}(1_A \mathcal{L}_{R|X}^{(\theta_2,\psi)/(\theta_0,\psi_0)}|\mathcal{X})$. This in turn
implies that $\mathcal{L}_{R|X}^{(\theta_1,\psi)/(\theta_0,\psi_0)} = \mathcal{L}_{R|X}^{(\theta_2,\psi)/(\theta_0,\psi_0)}$; we will denote this common value
$\mathcal{L}_{R|X}^{\psi/\psi_0}$. Moreover it can be proved (and is intuitive) that $\mathcal{L}_{R|X}^{\psi_0/\psi_0} = 1$, a.s.

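From now on, we fix our notation for $\mathcal{O}$ and $\mathcal{X}^r$ by $\mathcal{O} = \sigma(R_t X_t, R_t, t \ge 0)$
and, for $r = (r_t)$ a deterministic path of $R$, $\mathcal{X}^r = \sigma(X_t, t \ge 0 : r_t = 1) =
\sigma(r_t X_t, t \ge 0)$. Note that (1) remains true even if $A \in \mathcal{O}$. The inferential
likelihood is $\mathcal{L}_{\mathcal{O}}^{(\theta,\psi)/(\theta_0,\psi_0)}$ (which is $\mathcal{O}$-measurable and thus can be computed
from the observations) and the fundamental property yields:

$$\mathcal{L}_{\mathcal{O}}^{(\theta,\psi)/(\theta_0,\psi_0)} = E_{(\theta_0,\psi_0)}[\mathcal{L}_{\mathcal{F}}^{(\theta,\psi)/(\theta_0,\psi_0)} | \mathcal{O}],$$

where $\mathcal{L}_{\mathcal{F}}^{(\theta,\psi)/(\theta_0,\psi_0)}$ is the full likelihood.
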
4.3 Definition of ignorability 

If the m.l.i.d. (represented by $R$) is random it may still be tempting to ignore
it, treating it as fixed, and use for inference $\mathcal{L}_{\mathcal{X}^r}^{\theta/\theta_0}$, which is relatively easy to
compute and does not depend on $\psi$.

Definition 1 In the GCMP, the likelihood ratio ignoring the m.l.i.d. is the
likelihood ratio $\mathcal{L}_{\mathcal{X}^r}^{\theta/\theta_0}$ obtained under the assumption that $R$ is fixed at its
observed value $r$.

When the fixed m.l.i.d. assumption does not hold, the question arises
in which cases (if ever) $\mathcal{L}_{\mathcal{X}^r}^{\theta/\theta_0}$ leads to the same inference about $\theta$ as
$\mathcal{L}_{\mathcal{O}}^{(\theta,\psi_0)/(\theta_0,\psi_0)}$ if the true value of $\psi$ was known to be $\psi_0$, or $\mathcal{L}_{\mathcal{O}}^{(\theta,\psi)/(\theta_0,\psi_0)}$ if $\psi$ had to
be estimated.

To define ignorability we face the problem that both Radon-Nikodym
derivatives and conditional expectations are defined almost surely; for some
results we must restrict the theoretical framework to measures giving a non-null
probability to a denumerable set of trajectories of $R$.

Assumption A1 There exists a denumerable set $(r_1, r_2, \ldots)$ such that
$P_\theta(R = r_i) > 0$ and $\sum_i P_\theta(R = r_i) = 1$, for all $\theta$.

This is a theoretical limitation but it has no impact on applications since
in practice the times are always rounded. Some of the results below will need
this restriction, others will not.

Definition 2 (Ignorability) The m.l.i.d. will be called ignorable on $r$ if
$\mathcal{L}_{\mathcal{O}}^{(\theta,\psi_0)/(\theta_0,\psi_0)} = \mathcal{L}_{\mathcal{X}^r}^{\theta/\theta_0}$ a.s. on $\{R = r\}$, whatever $\psi_0$. It will be
called ignorable if assumption A1 holds and the m.l.i.d. is ignorable on $r$
for all values of $r$.

Remark. If the m.l.i.d. is ignorable it is then obvious that $\mathcal{L}_{\mathcal{X}^r}^{\theta/\theta_0}$ and
$\mathcal{L}_{\mathcal{O}}^{(\theta,\psi_0)/(\theta_0,\psi_0)}$ lead to the same inference about $\theta$.

4.4 Extension to vertical coarsening 

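This approach can readily be extended to fixed vertical coarsening, that is
the case where, when $R_t = 1$, $X_t$ is incompletely observed according to a
fixed m.l.i.d. (see section 3). Let $\mathcal{O}$ (resp. $\mathcal{O}'$) be the observed $\sigma$-field for the
completely vertically observed mechanism (resp. the incompletely vertically
observed mechanism). Since the vertical m.l.i.d. is fixed, $\mathcal{O}' \subset \mathcal{O}$. On $\{R = r\}$
we have $\mathcal{O} = \mathcal{X}^r$ and $\mathcal{O}' = \mathcal{X}'^r$ (where $\mathcal{X}'^r$ is the coarse observation of $\mathcal{X}^r$,
with $\mathcal{X}'^r \subset \mathcal{X}^r$). If ignorability holds for the completely vertically observed
couple $(X, R)$ we have on $\{R = r\}$, $\mathcal{L}_{\mathcal{O}}^{(\theta,\psi_0)/(\theta_0,\psi_0)} = \mathcal{L}_{\mathcal{X}^r}^{\theta/\theta_0}$. By Kallenberg's Lemma
we have, on $\{R = r\}$, $E(\mathcal{L}_{\mathcal{O}}^{(\theta,\psi_0)/(\theta_0,\psi_0)}|\mathcal{O}') = E(\mathcal{L}_{\mathcal{X}^r}^{\theta/\theta_0}|\mathcal{X}'^r)$, from which we deduce
that on $\{R = r\}$, $\mathcal{L}_{\mathcal{O}'}^{(\theta,\psi_0)/(\theta_0,\psi_0)} = \mathcal{L}_{\mathcal{X}'^r}^{\theta/\theta_0}$; that is, ignorability also holds for the
fixed vertically coarsened pattern.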



5 Static conditions of ignorability for the GCMP 

We give a first fact which does not seem to have been noted previously in a 
general context. 

Fact. Ignorability on $\{R_t = 1, t \ge 0\}$ always holds.

Proof. On the event $\{R_t = 1, t \ge 0\}$, $X$ and $R$ are observed so that
$\mathcal{F} = \mathcal{O}$; thus from Lemma 1 we have $\mathcal{L}_{\mathcal{O}}^{(\theta,\psi)/(\theta_0,\psi_0)} = E_{(\theta_0,\psi_0)}(\mathcal{L}_{\mathcal{F}}^{(\theta,\psi)/(\theta_0,\psi_0)}|\mathcal{O}) =
E_{(\theta_0,\psi_0)}(\mathcal{L}_{\mathcal{F}}^{(\theta,\psi)/(\theta_0,\psi_0)}|\mathcal{F}) = \mathcal{L}_{\mathcal{F}}^{(\theta,\psi)/(\theta_0,\psi_0)} = \mathcal{L}_{\mathcal{X}}^{\theta/\theta_0} \mathcal{L}_{R|X}^{\psi/\psi_0}$ (the last equality comes from
property iii) of the likelihood); for $\psi = \psi_0$ we retrieve the likelihood for
the complete observation of $X$ and ignorability holds on $\{R_t = 1, t \ge 0\}$. □

We shall now study "static" conditions of ignorability, in contrast with the
"dynamic" conditions of the next section. Gill et al. (1997) have introduced
two conditions of ignorability: CAR(REL) (Relative Coarsening At Random)
and CAR(ABS) (Absolute Coarsening At Random); these conditions were
further studied by Nielsen (2000). We give two conditions of ignorability
in the GCMP framework. The first one is an adaptation of CAR(REL); the
second one is stronger but original and does not imply CAR(ABS).

Definition 3 (CAR(GCMP)) We will say that CAR(GCMP) holds for
the couple $(X, R)$ if $\mathcal{L}_{R|X}^{\psi/\psi_0}$ is $\mathcal{O}$-measurable for all $(\theta, \psi)$.

With our notations the condition CAR(REL) is "there exists a version
of $\mathcal{L}_{O|X}^{\psi/\psi_0}(o, x)$ such that the mapping $x \mapsto \mathcal{L}_{O|X}^{\psi/\psi_0}(o, x)$ is constant for all $x$
compatible with $o$", where $o$ is an elementary event of $\mathcal{O}$, i.e. $o = (r, rx)$.
First note that with our assumptions, $x$ is compatible with $o = (r, y)$ iff $rx = y$
and second note that $\mathcal{L}_{R|X}(r; x) = \mathcal{L}_{O|X}(o; x)$ if $o = (r, rx)$ (by property
vi) and because $\mathcal{R} \vee \mathcal{X} = \mathcal{O} \vee \mathcal{X}$). So in the GCMP setting the condition
CAR(REL) becomes "there exists a version of $\mathcal{L}_{R|X}^{\psi/\psi_0}(r, x)$ such that: for all
$r$ and $(x, x')$ verifying $rx = rx'$, $\mathcal{L}_{R|X}^{\psi/\psi_0}(r, x) = \mathcal{L}_{R|X}^{\psi/\psi_0}(r, x')$".

Theorem 1 CAR(GCMP) is equivalent to CAR(REL) in the GCMP setting.
Proof. If CAR(REL) is true then for all $(r, x)$, $\mathcal{L}_{R|X}^{\psi/\psi_0}(r; x) = \mathcal{L}_{R|X}^{\psi/\psi_0}(r; rx)$
(by taking $x' = rx$ in the above formula) and then $\mathcal{L}_{R|X}^{\psi/\psi_0}$ is a function of
$(R, RX)$ and so is $\mathcal{O}$-measurable, that is CAR(GCMP). Conversely, if
$\mathcal{L}_{R|X}^{\psi/\psi_0}$ is $\mathcal{O}$-measurable, there is a version of it which is constant on the atoms
of $\mathcal{O}$, i.e. on the sets of the form $O_{r,y} = \{(r, x) \text{ such that } x \text{ verifies } rx = y\}$,
and we have CAR(REL). □

The next theorem shows that CAR(GCMP) implies a factorization of the
likelihood in two parts: one which depends on $\psi$ and the second one on $\theta$;
this may be called "weak ignorability".

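Theorem 2 (Factorization) If the couple $(R, X)$ satisfies CAR(GCMP)
then we have $\mathcal{L}_{\mathcal{O}}^{(\theta,\psi)/(\theta_0,\psi_0)} = \mathcal{L}_{R|X}^{\psi/\psi_0} E_{(\theta_0,\psi_0)}(\mathcal{L}_{\mathcal{X}}^{\theta/\theta_0}|\mathcal{O})$ and $E_{(\theta_0,\psi_0)}(\mathcal{L}_{\mathcal{X}}^{\theta/\theta_0}|\mathcal{O})$ does
not depend on $\psi_0$.

Proof. A proof can be obtained using the previous theorem and the fact
that CAR(REL) has been proved to imply a similar factorization theorem.
However a direct proof is quite simple in our formalism. From the
decomposition formula we have $\mathcal{L}_{\mathcal{F}}^{(\theta,\psi)/(\theta_0,\psi_0)} = \mathcal{L}_{\mathcal{X}}^{\theta/\theta_0} \mathcal{L}_{R|X}^{\psi/\psi_0}$. Taking the conditional
expectation given $\mathcal{O}$, we can take $\mathcal{L}_{R|X}^{\psi/\psi_0}$ out of the conditional expectation using CAR(GCMP), thus
obtaining $\mathcal{L}_{\mathcal{O}}^{(\theta,\psi)/(\theta_0,\psi_0)} = \mathcal{L}_{R|X}^{\psi/\psi_0} E_{(\theta_0,\psi_0)}(\mathcal{L}_{\mathcal{X}}^{\theta/\theta_0}|\mathcal{O})$. It only remains to prove that the
last term does not depend on $\psi_0$. We have

$$\mathcal{L}_{\mathcal{O}}^{(\theta,\psi)/(\theta_0,\psi_0)} = \mathcal{L}_{\mathcal{O}}^{(\theta,\psi)/(\theta_0,\psi_1)} \mathcal{L}_{\mathcal{O}}^{(\theta_0,\psi_1)/(\theta_0,\psi_0)}$$

As a result, we can use $E_{(\theta_0,\psi_0)}(\mathcal{L}_{\mathcal{X}}^{\theta/\theta_0}|\mathcal{O})$ for inference on $\theta$, getting rid of
the nuisance parameter $\psi$. The following condition allows us to obtain a more
precise result locally.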

Definition 4 (CAR(GCMP)-loc) We will call CAR(GCMP)-loc on $r$ the
condition: $\mathcal{L}_{R|X}^{(\theta,\psi)/(\theta_0,\psi_0)} = \mathcal{L}_{R|X^r}^{(\theta,\psi)/(\theta_0,\psi_0)}$ a.s. on $\{R = r\}$.

Remark. The above definition has a meaning only for $r$ such that
$P(\{R = r\}) > 0$.

The GCMP model verifies the condition CAR(ABS) defined by Nielsen
(2000) if $\forall (\theta, \psi)$, $P_{(\theta,\psi)}$ is CAR(ABS), i.e. the following condition is true:
for $P_\theta$ a.e. $x, x'$, for every $A \in \mathcal{O}$,

$$P_{(\theta,\psi)}(A \cap D_x \cap D_{x'} | X = x) = P_{(\theta,\psi)}(A \cap D_x \cap D_{x'} | X = x') \quad (2)$$

where $D_x = \{(r, y) \in \Gamma \times \Xi;\ ry = rx\} \in \mathcal{O}$. Note moreover that, as pointed
out by Gill et al. (1997), if $P_{(\theta_0,\psi_0)}$ is CAR(ABS) and the model is CAR(REL),
then the model is CAR(ABS).

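Theorem 3 If the GCMP is CAR(ABS), then CAR(GCMP)-loc holds on
all $r$.

Proof. Assume the model is CAR(ABS) and remark that $\{R = r\} \cap D_x =
\{R = r\} \cap D_{rx}$ and that $\forall (\theta, \psi)$, $P_{(\theta,\psi)}[D_x | X = x] = 1$; then $\forall (\theta, \psi)$, for $P_\theta$ a.e. $x$
and every $\mathcal{O}$-measurable $Z$, we get

$$E_{(\theta,\psi)}[1_{\{R=r\}} Z | X = x] = E_{(\theta,\psi)}[1_{\{R=r\}} Z 1_{D_x} | X = x]$$
$$= E_{(\theta,\psi)}[1_{\{R=r\}} Z | X = rx]$$

From this equality, we deduce that $\forall (\theta, \psi)$, for every $\mathcal{O}$-measurable $Z$,
$E_{(\theta,\psi)}[1_{\{R=r\}} Z | \mathcal{X}] = E_{(\theta,\psi)}[1_{\{R=r\}} Z | \mathcal{X}^r]$. It follows that for every $A \in \mathcal{R}$:
$E_{(\theta_0,\psi_0)}[1_{\{R=r\}} 1_A \mathcal{L}_{R|X}^{(\theta,\psi)/(\theta_0,\psi_0)} | \mathcal{X}] = E_{(\theta_0,\psi_0)}[1_{\{R=r\}} 1_A \mathcal{L}_{R|X^r}^{(\theta,\psi)/(\theta_0,\psi_0)} | \mathcal{X}^r]$,
which implies CAR(GCMP)-loc. □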



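Theorem 4 CAR(GCMP)-loc on $r$ implies ignorability on $r$.

Proof. By the iterated decomposition formula (property v), we have
$\mathcal{L}_{X,X^r,R} = \mathcal{L}_{X|X^r,R} \mathcal{L}_{R|X^r} \mathcal{L}_{X^r}$. We have $\mathcal{L}_{\mathcal{O}} = E[\mathcal{L}_{X,R}|\mathcal{O}] =
E[\mathcal{L}_{X|X^r,R} \mathcal{L}_{R|X^r} \mathcal{L}_{X^r}|\mathcal{O}]$. On $\{R = r\}$, $\mathcal{O} = \mathcal{X}^r \vee \mathcal{R}$ and by Lemma 1 we
have:

$$\mathcal{L}_{\mathcal{O}}^{(\theta,\psi)/(\theta_0,\psi_0)} = \mathcal{L}_{\mathcal{X}^r}^{\theta/\theta_0} \mathcal{L}_{R|X^r}^{(\theta,\psi)/(\theta_0,\psi_0)} E_{(\theta_0,\psi_0)}[\mathcal{L}_{X|X^r,R}^{(\theta,\psi)/(\theta_0,\psi_0)} | \mathcal{X}^r \vee \mathcal{R}] = \mathcal{L}_{\mathcal{X}^r}^{\theta/\theta_0} \mathcal{L}_{R|X^r}^{(\theta,\psi)/(\theta_0,\psi_0)}.$$

If CAR(GCMP)-loc holds we have on $\{R = r\}$, $\mathcal{L}_{R|X}^{(\theta,\psi)/(\theta_0,\psi_0)} = \mathcal{L}_{R|X^r}^{(\theta,\psi)/(\theta_0,\psi_0)}$;
by assumption (1) we have $\mathcal{L}_{R|X}^{(\theta,\psi)/(\theta_0,\psi_0)} = \mathcal{L}_{R|X}^{\psi/\psi_0}$. Thus with both conditions we have $\mathcal{L}_{\mathcal{O}}^{(\theta,\psi)/(\theta_0,\psi_0)} =
\mathcal{L}_{\mathcal{X}^r}^{\theta/\theta_0} \mathcal{L}_{R|X}^{\psi/\psi_0}$ a.s. on $\{R = r\}$ and this implies ignorability. □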



6 Dynamical conditions of ignorability 

6.1 R representable by a counting process 

Assume that we can represent $(R_t)$ by a marked point process $(N_t)$; if the
component $R_h$ of $R$ takes value 1 at isolated points $V_{h,1}, V_{h,2}, \ldots$ (discrete-time
observations), then its associated counting process $N^{h,d}$ is defined by $N^{h,d}_t =
\sum_{j \ge 1} 1_{\{V_{h,j} \le t\}}$. If the component $R_{h,t}$ is cadlag, it can be written as $R_{h,t} =
\sum_{k \ge 0} 1_{[V_{h,2k}, V_{h,2k+1}[}(t)$ and its associated counting process is $N^{h,c}_t = \sum_{j \ge 0} 1_{\{V_{h,j} \le t\}}$. If
the component $R_h$ of $R$ is a mixture of the two kinds of observation (discrete-
and continuous-time) then at points of discontinuity we have to add a mark
which indicates if the jump affects the discrete part of $R_h$, represented by $N^{h,d}$,
or the cadlag one, represented by $N^{h,c}$. In all cases $R$ can be represented by a
marked point process $(N_t)$. Denote by $(\mathcal{N}_t)$ the self-exciting filtration of $(N_t)$.
We define the filtration $(\mathcal{O}_t)$ as the family of $\sigma$-fields $\mathcal{O}_t = \sigma(N_u, R_u X_u, 0 \le
u \le t)$. Let us call $\Lambda^{\mathcal{O},N} = (\Lambda^{\mathcal{O},N}_t)$ and $\Lambda^{\mathcal{F}^*,N} = (\Lambda^{\mathcal{F}^*,N}_t)$ the compensators of $N$
in the filtrations $(\mathcal{O}_t)$ and $(\mathcal{F}^*_t)$ respectively and for probability $P_{(\theta,\psi)}$, where
$(\mathcal{F}^*_t)$ is the family of $\sigma$-fields $\mathcal{F}^*_t = \mathcal{X} \vee \mathcal{O}_t$, $t \ge 0$. The compensators generally
depend on $(\theta, \psi)$ but we omit this for notational simplicity; the compensators
for $P_{(\theta_0,\psi_0)}$ will be denoted $\Lambda_0$. In the following we will assume that there exists
a fixed time $\tau$ such that $R_{\tau+u} = R_\tau$ and that there is no explosion of the
process on $[0, \tau]$ (i.e. $\sum_{t \le \tau} \Delta N_t < \infty$) (thus $\Lambda_{\tau+u} = \Lambda_\tau$, $u > 0$, for any
filtration).

Definition 5 (CAR(DYN)) We will denote CAR(DYN) the condition:
$\forall (\theta, \psi)$ we have under $P_{(\theta,\psi)}$: $(\Lambda^{\mathcal{O},N}_t) = (\Lambda^{\mathcal{F}^*,N}_t)$ (up to
indistinguishability).

Remark. This is an absolute condition in the sense of Gill et al. (1997) and
Nielsen (2000) since the CAR(DYN) criterion is defined for each probability
separately while the definition of CAR(GCMP) bears on Radon-Nikodym
derivatives.

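Theorem 5 CAR(DYN) implies CAR(GCMP).

Proof. Let us write the likelihood for $N$ and $X$. The filtration $(\mathcal{F}^*_t)$ is the
self-generated filtration for $N$ when $\mathcal{F}^*_0 = \mathcal{X}$. Thus we have, using Jacod's
formula, $\mathcal{L}_{\mathcal{F}} = \mathcal{L}_{\mathcal{X}}^{\theta/\theta_0} \phi(\Lambda^{\mathcal{F}^*,N}_u, \Lambda^{\mathcal{F}^*,N}_{0u}, N_u, u \ge 0)$, where

$$\phi(\Lambda, \Lambda_0, N) = \phi(\Lambda_u, \Lambda_{0u}, N_u, 0 \le u) = \prod_{u \ge 0} \left[ \prod_h \left( \frac{d\Lambda_{h,u}}{d\Lambda_{0h,u}} \right)^{\Delta N_{h,u}} \left( \frac{1 - d\Lambda_{\cdot u}}{1 - d\Lambda_{0 \cdot u}} \right)^{1 - \Delta N_{\cdot u}} \right]$$

where $N_\cdot = \sum_h N_h$ and $\Lambda_\cdot = \sum_h \Lambda_h$. Identifying with the decomposition
$\mathcal{L}_{\mathcal{F}} = \mathcal{L}_{\mathcal{X}}^{\theta/\theta_0} \mathcal{L}_{R|X}$ we find that $\mathcal{L}_{R|X} = \phi(\Lambda^{\mathcal{F}^*,N}, \Lambda^{\mathcal{F}^*,N}_0, N)$. With
CAR(DYN) this is equal to $\phi(\Lambda^{\mathcal{O},N}, \Lambda^{\mathcal{O},N}_0, N)$, which is $\mathcal{O}$-measurable, and
thus CAR(GCMP) holds. □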

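Theorem 6 CAR(DYN) implies CAR(GCMP)-loc for all $r$.

Proof. Let us define $\mathcal{F}^r_t = \mathcal{X}^r \vee \mathcal{N}_t$ and $\mathcal{F}^r = \mathcal{X}^r \vee \mathcal{N}$; then for all $t \in [0, \tau]$,
$\mathcal{F}^r_t \subset \mathcal{F}^*_t$. As seen in the previous proof, $\mathcal{L}_{R|X}^{(\theta,\psi)/(\theta_0,\psi_0)} = \phi(\Lambda^{\mathcal{F}^*,N}, \Lambda^{\mathcal{F}^*,N}_0, N)$
and $\mathcal{L}_{R|X^r}^{(\theta,\psi)/(\theta_0,\psi_0)} = \phi(\Lambda^{\mathcal{F}^r,N}, \Lambda^{\mathcal{F}^r,N}_0, N)$. So the proof will be complete if
$\forall t \in [0, \tau]$, $\Lambda^{\mathcal{F}^r,N}_t 1_{\{R=r\}} = \Lambda^{\mathcal{F}^*,N}_t 1_{\{R=r\}}$. We get

$$\Lambda^{\mathcal{F}^r,N}_t 1_{\{R=r\}} = E[\Lambda^{\mathcal{F}^*,N}_t | \mathcal{F}^r_t] 1_{\{R=r\}} \quad (3)$$
$$= E[\Lambda^{\mathcal{O},N}_t 1_{\{R_{\cdot \wedge t} = r_{\cdot \wedge t}\}} | \mathcal{F}^r_t] 1_{\{R=r\}} \quad (4)$$
$$= \Lambda^{\mathcal{O},N}_t 1_{\{R=r\}} \quad (5)$$

(3) is due to the innovation theorem; (4) is due to CAR(DYN) and the
fact that $\{R_{\cdot \wedge t} = r_{\cdot \wedge t}\} \supset \{R = r\}$ and $\{R_{\cdot \wedge t} = r_{\cdot \wedge t}\} \in \mathcal{O}_t \cap \mathcal{F}^r_t$. At last
the local inclusion $\{R_{\cdot \wedge t} = r_{\cdot \wedge t}\} \cap \mathcal{O}_t \subset \{R_{\cdot \wedge t} = r_{\cdot \wedge t}\} \cap \mathcal{F}^r_t$ implies the
measurability of $\Lambda^{\mathcal{O},N}_t 1_{\{R_{\cdot \wedge t} = r_{\cdot \wedge t}\}}$ and (5). Using again CAR(DYN), we have
the desired equality. □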

Corollary 1 If $R$ is $(\mathcal{O}_t)$-predictable, ignorability holds.

Proof. The associated counting process $N$ is itself $(\mathcal{F}^*_t)$-predictable. Thus
the Doob-Meyer decomposition is $N_t = N_t + 0$ in both $(\mathcal{O}_t)$ and $(\mathcal{F}^*_t)$. It
follows that $\Lambda^{\mathcal{F}^*,N} = \Lambda^{\mathcal{O},N} = N$, which is CAR(DYN). By Theorems 6 and 4
ignorability holds. □

In the framework of counting processes, Andersen et al. (1993) have
proposed a criterion of independent right-censoring. We adapt their criterion
to right-continuous censoring processes and we restrict to the univariate case
for simplicity. In that case $R_t = 1_{\{t < C\}}$ where $C$ is a censoring variable and
$N_t = 1 - R_t$. Let $X$ be a counting process and $\Lambda^{\mathcal{X},X}$ its compensator in
the self-generated filtration $(\mathcal{X}_t)$. Let $(\mathcal{F}_t)$ be the filtration generated by both
$X$ and $N$. We have independent right-censoring if the compensator of $X$
is the same in the filtration including information on the censoring, that is:
$\Lambda^{\mathcal{F},X} = \Lambda^{\mathcal{X},X}$.

Theorem 7 Let $X$ be a counting process which admits a cag (left-continuous)
intensity $\lambda^{\mathcal{X},X}$ in the self-generated filtration $(\mathcal{X}_t)$. Consider a right-continuous
right-censoring process of $X$ satisfying CAR(DYN); then this is an independent
censoring. Conversely, independent censoring implies CAR(DYN).

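Proof. The likelihood of the counting process $(X, N)$ can be written using
Jacod's formula as $\mathcal{L}_{X,N} = \phi(\Lambda^{\mathcal{F},N}, \Lambda^{\mathcal{F},N}_0, N) \phi(\Lambda^{\mathcal{F},X}, \Lambda^{\mathcal{F},X}_0, X)$. As we have
done above we can also write it: $\mathcal{L}_{X,N} = \mathcal{L}_{\mathcal{X}}^{\theta/\theta_0} \phi(\Lambda^{\mathcal{F}^*,N}, \Lambda^{\mathcal{F}^*,N}_0, N)$. Noting
that $\mathcal{L}_{\mathcal{X}}^{\theta/\theta_0} = \phi(\Lambda^{\mathcal{X},X}, \Lambda^{\mathcal{X},X}_0, X)$ and equating the two representations we have:

$$\phi(\Lambda^{\mathcal{F},X}, \Lambda^{\mathcal{F},X}_0, X) \phi(\Lambda^{\mathcal{F},N}, \Lambda^{\mathcal{F},N}_0, N) = \phi(\Lambda^{\mathcal{X},X}, \Lambda^{\mathcal{X},X}_0, X) \phi(\Lambda^{\mathcal{F}^*,N}, \Lambda^{\mathcal{F}^*,N}_0, N).$$

CAR(DYN) says that $\Lambda^{\mathcal{F}^*,N} = \Lambda^{\mathcal{O},N}$ and in this right-censoring case we
have $\Lambda^{\mathcal{O},N} = \Lambda^{\mathcal{F},N}$ (this is because $\mathcal{O}_t = \mathcal{F}_t$ on $\{t < C\}$ and $\Lambda_{C+u} = \Lambda_C$
whatever the filtration). So if CAR(DYN) holds, the above equation yields
$\phi(\Lambda^{\mathcal{F},X}, \Lambda^{\mathcal{F},X}_0, X) = \phi(\Lambda^{\mathcal{X},X}, \Lambda^{\mathcal{X},X}_0, X)$. This must be true almost surely, for
all $(\theta, \psi)$, and moreover, we still have this equality if we stop the observation
at time $t$ or at an $(\mathcal{X}_t)$-stopping time $T$. All that we have to prove is that this
implies $\Lambda^{\mathcal{F},X} = \Lambda^{\mathcal{X},X}$, which is independent censoring.

Let us begin with $X$ a 0-1 counting process and denote its jump time
$T$. If we stop observation at $t$, we have on $\{T > t\}$,

$$\frac{\exp(-\Lambda^{\mathcal{F},X}_t)}{\exp(-\Lambda^{\mathcal{F},X}_{0t})} = \frac{\exp(-\Lambda^{\mathcal{X},X}_t)}{\exp(-\Lambda^{\mathcal{X},X}_{0t})};$$

because of left-continuity, we have also the equality at $t = T$ and because the
intensity is equal to zero after $T$, the equality holds for all $t$ almost surely.
Taking logs and differentiating we obtain:

$$\lambda^{\mathcal{F},X}_t - \lambda^{\mathcal{F},X}_{0t} = \lambda^{\mathcal{X},X}_t - \lambda^{\mathcal{X},X}_{0t}. \quad (6)$$

The likelihood has a limit when $t \to \infty$ and at the limit we have

$$\frac{\lambda^{\mathcal{F},X}_T \exp(-\Lambda^{\mathcal{F},X}_T)}{\lambda^{\mathcal{F},X}_{0T} \exp(-\Lambda^{\mathcal{F},X}_{0T})} = \frac{\lambda^{\mathcal{X},X}_T \exp(-\Lambda^{\mathcal{X},X}_T)}{\lambda^{\mathcal{X},X}_{0T} \exp(-\Lambda^{\mathcal{X},X}_{0T})},$$

from which we successively deduce $\frac{\lambda^{\mathcal{F},X}_T}{\lambda^{\mathcal{F},X}_{0T}} = \frac{\lambda^{\mathcal{X},X}_T}{\lambda^{\mathcal{X},X}_{0T}}$ and $\frac{\lambda^{\mathcal{F},X}_t}{\lambda^{\mathcal{F},X}_{0t}} = \frac{\lambda^{\mathcal{X},X}_t}{\lambda^{\mathcal{X},X}_{0t}}$,
almost surely for all $t$ on the support of the distribution of $T$. Combining
this result with (6), we obtain $\lambda^{\mathcal{F},X}_t = \lambda^{\mathcal{X},X}_t$ a.s., which for cag processes
implies indistinguishability of the intensities and of the cumulative intensities.
If the process may have several jumps $T_1, T_2, \ldots$, we first prove by the same
reasoning that we have equality of the intensities on $\{t \le T_1\}$, then using
this result and again the same reasoning we have equality on $]T_1, T_2]$ and so
on. All this reasoning is symmetrical so we can also prove that independent
censoring implies CAR(DYN). □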

6.2 Extension to left-continuous R 

In some situations it is natural to consider response indicator processes which
are left-continuous; Type II right-censoring is an example. In that case it
is not possible to directly associate to $R$ a counting process and hence to
apply CAR(DYN). In order to extend the application of CAR(DYN) to such
processes, we will consider them as limits of right-continuous processes. Let
us still consider the univariate case. Consider for instance the case where
$R_0 = 1$. $(R_t)$ may be left-continuous at jumps of odd ranks, $V_{2j+1}$, $j \ge 0$;
the process can be written $R_t = \sum_{j \ge 0} 1_{[V_{2j}, V_{2j+1}]}(t)$. Consider the sequence
of processes $R^n_t = \big(\sum_{j \ge 0} 1_{\{V_{2j} \le t < V^n_{2j+1}\}}\big) \wedge 1$, defined by $R^n_0 = 1$ and
$V^n_{2j+1} = V_{2j+1} + 1/n$. The limit of $(R^n_t)$ is $(R_t)$.

Theorem 8 Consider a process $R = (R_t)$ which is right-continuous at upward
jumps and may be left-continuous at downward jumps. Consider a sequence
of right-continuous processes $R^n = (R^n_t)$ constructed as above; if each
$(X, R^n)$ satisfies CAR(GCMP)-loc on $r$ then ignorability holds for $(X, R)$ on
$r$.

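Proof. We note $\mathcal{O}^n$ the observed $\sigma$-field associated to $R^n$. If $(X, R^n)$
satisfies CAR(GCMP)-loc on $r$ then (see the proof of Theorem Q), we have
for all $n$, on $\{R = r\}$:

$$\mathcal{L}_{\mathcal{O}^n}^{(\theta,\psi)/(\theta_0,\psi_0)} = \mathcal{L}_{\mathcal{X}^{r^n}}^{\theta/\theta_0}\,\mathcal{L}_{R^n|X}^{\psi/\psi_0}.$$

$\mathcal{O}^n$ is larger than $\mathcal{O}$: $\mathcal{O} \subset \mathcal{O}^n$, and it is clear that $(\mathcal{O}^n)$ is a decreasing
sequence of $\sigma$-fields: $\mathcal{O} = \cap_n \mathcal{O}^n = \mathcal{O}^\infty$. By the Downward Levy Theorem
(Williams, 1990) we have:

$$\mathcal{L}_{\mathcal{O}^n}^{(\theta,\psi)/(\theta_0,\psi_0)} = E_{(\theta_0,\psi_0)}\big(\mathcal{L}_{X,R}^{(\theta,\psi)/(\theta_0,\psi_0)} \mid \mathcal{O}^n\big) \to E_{(\theta_0,\psi_0)}\big(\mathcal{L}_{X,R}^{(\theta,\psi)/(\theta_0,\psi_0)} \mid \mathcal{O}\big) = \mathcal{L}_{\mathcal{O}}^{(\theta,\psi)/(\theta_0,\psi_0)} \quad \text{a.s.}$$

Using again the Downward Levy Theorem we get $\mathcal{L}_{\mathcal{X}^{r^n}}^{\theta/\theta_0} \to \mathcal{L}_{\mathcal{X}^r}^{\theta/\theta_0}$ a.s.
Moreover note that $\mathcal{R}^n = \mathcal{R}$ because the process $R^n$ is deterministically
defined from $R$: this implies that $\mathcal{L}_{R^n|X}^{\psi/\psi_0} = \mathcal{L}_{R|X}^{\psi/\psi_0}$. At the limit
we have thus: $\mathcal{L}_{\mathcal{O}}^{(\theta,\psi)/(\theta_0,\psi_0)} = \mathcal{L}_{\mathcal{X}^r}^{\theta/\theta_0}\,\mathcal{L}_{R|X}^{\psi/\psi_0}$, which concludes the proof. A similar
result could be obtained for upward jumps. □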
As an example consider the case of Type II right-censoring where we
have independent 0-1 counting processes $(X^i_t)$, $i = 1, \dots, n$ and observation
is stopped just after observing the $d$-th event. Thus the response indicator
process $R$ is not independent of the multivariate process $X$. In fact we have
$R_t = 1_{\{\bar{X}_t < d\}}$, where $\bar{X}_t = \sum_i X^i_{t-}$. This is a case of a left-continuous process
which has only one downward jump. Since $R$ is $\mathcal{X}$-measurable, $\mathcal{L}_{R|X}^{\psi/\psi_0} = 1$
(by property iv) and is thus obviously $\mathcal{O}$-measurable, which is CAR(GCMP).
Consider now a slightly more sophisticated model which we call randomized
Type II censoring, in which we may stop observation after each event with
a given probability depending on what has been observed. For instance let
$(T_1, T_2, \dots, T_n)$ be the times of occurrence of the first, second, ..., $n$-th events, and let
the probability of stopping observation just after $T_j$ (conditional on having
observed $X$ until $T_j$) be specified for each $j = 1, \dots, n$; let $C$ be the jump time of $R$
($C = T_j$ for some $j$). We consider $R$ as the limit of the sequence of right-continuous
processes $R^n$ such that $R^n_t = 1_{\{t < C + 1/n\}}$. We can easily verify
that these observation processes satisfy CAR(DYN) (because future values
of $X$ are not used for defining the probability of stopping observation), and
thus CAR(GCMP) by Theorem 5; thus $R$ itself satisfies CAR(GCMP) by
Theorem 8.
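
As an illustration only, randomized Type II censoring can be simulated with the short sketch below. The exponential event times, the function name `simulate_randomized_type2` and the form of the stopping rule `stop_prob` are assumptions made for this example and are not part of the model above; the only feature taken from the text is that the decision to stop just after the $j$-th event uses the events observed so far and never future values of $X$.

```python
import random

def simulate_randomized_type2(n, stop_prob, rate=1.0, seed=0):
    """Simulate n i.i.d. exponential event times (an assumption of this sketch)
    and stop observation just after the j-th ordered event with probability
    stop_prob(j, observed_times).

    Returns (observed_times, C), where C is the time at which observation
    stopped, or None if observation was never stopped."""
    rng = random.Random(seed)
    times = sorted(rng.expovariate(rate) for _ in range(n))
    observed = []
    for j, t in enumerate(times, start=1):
        observed.append(t)  # the j-th event itself is observed
        # The stopping decision depends only on what has been observed so far.
        if rng.random() < stop_prob(j, tuple(observed)):
            return observed, t
    return observed, None

# Classical Type II censoring is the special case "stop surely at the d-th event".
d = 3
obs, C = simulate_randomized_type2(10, lambda j, past: 1.0 if j == d else 0.0)
```

With this deterministic rule, `obs` holds the first `d` ordered event times and `C` equals the last of them; a rule returning values strictly between 0 and 1 gives the randomized version.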

7 Applications 

7.1 Right-censoring of counting processes with time-dependent covariable

We consider the modelling of independent counting processes $W^i = (W^i_t)$
with possibly time-dependent explanatory variables $Z^i = (Z^i_t)$. We observe
$X^i = (W^i, Z^i)$, $i = 1, \dots, n$ through a mechanism specified by
$R = (R^{W^1}, \dots, R^{W^n}; R^{Z^1}, \dots, R^{Z^n})$. The $(Z^i_t)$ are supposed to be completely
observed so that $R^{Z^i}_t = 1$ for all $t$. We consider a family of probability measures
$\{P^{\theta,\gamma,\psi}\}$ where $\theta$ is the parameter of interest which parameterizes the dynamics
of $(W^i_t)$ given the value of the explanatory variables, that is, the intensity
of $(W^i_t)$ in the filtration $(\mathcal{X}_t)$ (defined as the family of $\sigma$-fields $\mathcal{X}_t = \mathcal{W}_t \vee \mathcal{Z}_t$)
depends on $\theta$ only. Let $\gamma$ parameterize the marginal law of $(Z^i_t)$ (this is possible
for external explanatory variables; if there is a causal effect of $W$ on $Z$
we must resort to joint modelling). Consider that assumption A1 holds; for
right-censoring this means that the set of times at which observation may be
stopped is denumerable; this is not a limitation in practice: for instance in an
epidemiological cohort we may say that observation may be stopped each day
at a fixed hour (we generally do not have a precision better than one day). If
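ignorability holds for $(X, R)$, we have on $\{R = r\}$, $\mathcal{L}_{\mathcal{O}}^{\theta,\gamma,\psi/\theta_0,\gamma_0,\psi_0} = \mathcal{L}_{\mathcal{X}^r}^{\theta,\gamma/\theta_0,\gamma_0}\,\mathcal{L}_{R|X}^{\psi/\psi_0}$.
But we have also $\mathcal{L}_{\mathcal{X}^r}^{\theta,\gamma/\theta_0,\gamma_0} = \mathcal{L}_{\mathcal{W}^r|\mathcal{Z}}^{\theta/\theta_0}\,\mathcal{L}_{\mathcal{Z}}^{\gamma/\gamma_0}$, so that in terms of inference about
$\theta$ we only need to compute $\mathcal{L}_{\mathcal{W}^r|\mathcal{Z}}^{\theta/\theta_0}$. The conclusion is that we can
use the conditional likelihood of $W$ given $Z$ while considering the
ignorability condition for $(X, R)$; that is, the response indicator process may
depend on both observed $W$ and $Z$.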

We give a particular example of artificial right censoring of 0-1 counting
processes which is compatible with ignorability (assuming to simplify that
there is no other source of random censoring). Consider a study where $n$
subjects are potentially followed up until a time $t^*$; it is assumed that the
processes $(X^i, Z^i)$ are independent and identically distributed. Suppose that
at a given time $t_1$ we make an analysis of the data. Using for instance maximum
likelihood estimators in a parametric model we can construct an estimator
$\hat\theta_1$ which by definition is $\mathcal{O}_{t_1}$-measurable. Let us suppose for simplicity
that $Z_t$, $t > t_1$, is known at $t_1$; it is then possible to compute an estimator
of the probability that subject $i$ experiences the event before the end of the
study, $P_{\hat\theta_1}(W^i_{t^*} = 1 \mid \mathcal{W}_{t_1}, \mathcal{Z}_{t^*})$, where $\mathcal{W}_t$ and $\mathcal{Z}_t$ denote $\sigma$-fields generated by
the $n$ processes up to time $t$. To reduce the cost of the study without
reducing its power too much, we may decide to follow after $t_1$ only the subjects
for whom $P_{\hat\theta_1}(W^i_{t^*} = 1 \mid \mathcal{W}_{t_1}, \mathcal{Z}_{t^*}) > c$, for some chosen $c$. In the GCMP
this means putting $R^{W^i}_t = 0$ for $t > t_1$ for those subjects with an estimated
probability below $c$. It is clear that $R$ is $(\mathcal{O}_t)$-predictable, which from Corollary
^ implies ignorability on all $r$ because A1 holds. Note that ignorability
does not depend on the correct specification of the model used at $t_1$; of course
the validity of the final analysis will depend on the correct specification of the
model used for it.
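
The selection rule of this example can be sketched as follows. This is an illustration under stated assumptions: the function name `artificial_censoring` and the numerical values are hypothetical, and the estimated probabilities are taken as given (in practice they would come from the model fitted at $t_1$). The point mirrored from the text is that the response indicators after $t_1$ are a function of quantities computed from data observed up to $t_1$ only.

```python
def artificial_censoring(pred_prob, c):
    """Response indicators after t1: 1 = follow-up continues, 0 = censored at t1.

    pred_prob[i] is the estimated probability, computed from data observed
    up to t1, that subject i experiences the event before the end of the study.
    Subjects are followed after t1 only if this probability exceeds c."""
    return [1 if p > c else 0 for p in pred_prob]

# Hypothetical estimated probabilities for five subjects and threshold c = 0.10.
probs = [0.05, 0.40, 0.12, 0.80, 0.10]
r_after_t1 = artificial_censoring(probs, c=0.10)
```

Because `r_after_t1` depends only on `probs`, which are known at $t_1$, the resulting response process is determined by the observed past, whatever the quality of the model that produced the probabilities.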

7.2 Longitudinal markers (continuous state-space processes)
7.2.1 Fixed observation times 

This is the classical set-up of "repeated measurements" or "longitudinal
data" (in a narrow sense). We consider independent continuous state-space
processes $(X^i_t)$, $i = 1, \dots, n$; for simplicity we do not consider covariates.
Here $(X^i_t)$ is planned to be observed at $v_j$, $j = 1, \dots, m$ but there may
be missing data. The response indicator processes for $(X^i_t)$ can be written:
$R^i_t = \sum_{j=1}^m Y^i_j 1_{\{t = v_j\}}$ where the $Y^i_j$ are binary variables. The jump at $v_j$
of the compensator of the counting process associated to $R^i$ in a filtration
$(\mathcal{G}_t)$ is $P(Y^i_j = 1 \mid \mathcal{G}_{v_j-})$. CAR(DYN) can thus be expressed as:

$$P(Y^i_j = 1 \mid Y^i_1, \dots, Y^i_{j-1}, \mathcal{X}) = P(Y^i_j = 1 \mid Y^i_1, \dots, Y^i_{j-1}, Y^i_1 X^i_{v_1}, \dots, Y^i_{j-1} X^i_{v_{j-1}}),$$

which implies (by taking conditional expectation) that

$$P(Y^i_j = 1 \mid Y^i_1, \dots, Y^i_{j-1}, \mathcal{X}_{v_j}) = P(Y^i_j = 1 \mid Y^i_1, \dots, Y^i_{j-1}, Y^i_1 X^i_{v_1}, \dots, Y^i_{j-1} X^i_{v_{j-1}});$$

this can intuitively be interpreted as saying that the missing data mechanism
may depend (only) on the observed values of $X^i$ and on the $Y^i_k$ up to just before time
$v_j$; this case could be treated with the conventional MAR concept.
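
A small simulation can make this interpretation concrete. In the sketch below, which is only illustrative, the probability of observing the $j$-th planned measurement is a function of the past response indicators and of the past observed values only; the function names and the particular rule are hypothetical choices for this example.

```python
import random

def simulate_mar_indicators(x, prob_observe, seed=0):
    """Simulate response indicators Y_1, ..., Y_m for one subject.

    x: planned measurements at the fixed times v_1, ..., v_m.
    prob_observe(past): P(Y_j = 1 | past), where past is the list of pairs
    (Y_k, Y_k * x_k) for k < j, so unobserved values never enter the rule."""
    rng = random.Random(seed)
    past = []
    for xj in x:
        p = prob_observe(past)
        y = 1 if rng.random() < p else 0
        past.append((y, y * xj))  # a missing value is recorded as 0, with y = 0
    return [y for y, _ in past]

def stop_after_low_value(past):
    # Hypothetical rule: stop measuring once an observed value fell below 1.
    return 0.0 if any(y == 1 and v < 1.0 for y, v in past) else 1.0

indicators = simulate_mar_indicators([2.0, 0.5, 3.0, 4.0], stop_after_low_value)
```

Here the second measurement (0.5) is observed and triggers missingness of all later measurements, so the mechanism depends on the data only through what was actually observed.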

7.2.2 Random observation times 

We may consider determining observation times in order to reduce the costs
of a study or to improve the monitoring of patients. Consider the case of a
study of the evolution of CD4 lymphocyte counts in HIV-infected patients;
CD4 counts at time $t$ are represented by $(X^i_t)$; the time of the next visit,
$V^i_{j+1}$, may depend on the CD4 count observed at $V^i_j$: $V^i_{j+1} = V^i_j + f(X^i_{V^i_j})$. For
instance we could decide to see a patient with a delay of three months if CD4
> 500, two months if 200 ≤ CD4 ≤ 500 and one month if CD4 < 200. It is
clear that CAR(DYN) would hold in this instance. On the contrary if the
visit time was decided based in part on clinical symptoms not included in the
model, or if drop-out could be due to severe clinical events or death (related
to CD4 and not included in the model), CAR(DYN) would not hold.
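
The visit rule of this example can be written as a short sketch. The function names, the time unit (months), the 12-month horizon and the linearly declining CD4 trajectory are assumptions made for illustration, as is the treatment of the boundary values 200 and 500 as belonging to the middle category. What the sketch mirrors from the text is that each visit time is computed from the CD4 count observed at the previous visit only.

```python
def next_visit_delay_months(cd4):
    """Delay before the next visit, as a function of the observed CD4 count."""
    if cd4 > 500:
        return 3
    if cd4 >= 200:
        return 2
    return 1

def visit_times(cd4_at, first_visit=0.0, horizon=12.0):
    """Visit times for one subject up to the horizon (in months).

    cd4_at: callable giving the CD4 count at time t; it is evaluated at
    visit times only, so the schedule uses observed values exclusively."""
    times = [first_visit]
    while True:
        nxt = times[-1] + next_visit_delay_months(cd4_at(times[-1]))
        if nxt > horizon:
            break
        times.append(nxt)
    return times

# Hypothetical trajectory: CD4 count declining by 50 per month from 600.
schedule = visit_times(lambda t: 600 - 50 * t)
```

For this trajectory the visits become more frequent as the count declines: `schedule` is `[0.0, 3.0, 5.0, 7.0, 9.0, 10.0, 11.0, 12.0]`.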

7.2.3 Vertically coarsened observations 

If we are interested in the evolution of HIV-RNA, we may consider the same
issues as above ($(X^i_t)$ now representing HIV-RNA), with the additional
complexity of a (known) detection limit $\eta$: this produces a left-censoring of $X^i_t$:
if $R^i_t = 1$ we observe $\max(X^i_t, \eta)$. This is what we have called "vertical
coarsening". Here the vertical m.l.i.d. is fixed so for studying ignorability it is
sufficient to study the case without vertical coarsening.
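
As a minimal illustration of vertical coarsening with a fixed detection limit, the sketch below maps the values of the process at the planned times to what is actually recorded. The function name, the use of `None` for unobserved times and the numerical values are assumptions made for this example.

```python
def vertically_coarsen(values, responses, eta):
    """Recorded data under a known detection limit eta.

    values[k] is the value of the process at the k-th planned time and
    responses[k] the response indicator: where it equals 1 we record
    max(value, eta) (left-censoring at eta); where it equals 0 nothing
    is recorded, which is represented here by None."""
    return [max(x, eta) if r == 1 else None for x, r in zip(values, responses)]

# Hypothetical values with detection limit eta = 1.7.
recorded = vertically_coarsen([1.2, 3.5, 2.0, 4.1], [1, 1, 0, 1], eta=1.7)
```

The first value lies below the limit and is recorded as 1.7, and the third is not observed at all; the coarsening rule itself involves no randomness, in line with the vertical m.l.i.d. being fixed.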

7.3 Joint modeling 

One of the reasons for considering a joint model is precisely to remove the
bias due to so-called "informative censoring". Consider as in the example
of 7.2.2 that we are interested in the evolution of CD4 lymphocyte counts
represented by $(W^i_t)$. Assume that subjects are lost to follow-up when
they develop AIDS. There is a rather strong relationship between CD4
lymphocyte counts and the risk of developing AIDS, so that the intensity of the
counting process describing drop-out depends on unobserved values of CD4
counts: CAR(DYN) does not hold in a model which does not include AIDS.

Thus we may consider jointly modelling CD4 counts and AIDS and consider
the process $X^i_t = (W^i_t, Y^i_t, Z^i_t)$, where $W^i_t$ represents CD4 counts, $Y^i_t$ is
a counting process which represents AIDS and $Z^i_t$ is a multivariate process of
explanatory variables. We may allow as in 7.2.2 that the visit times depend
on the observed CD4 counts; we may also allow that the probability that
the subject drops out after $V^i_j$ depends on the AIDS status at this visit. The
compensator of the associated counting process $N^i$ will be constant between
$V^i_j$ and $V^i_{j+1}$ and will make a jump at $V^i_{j+1}$ equal to one minus the probability
of drop-out: if the latter depends on the observed AIDS status $Y^i_{V^i_j}$ only,
then CAR(DYN) holds; if it depends on unobserved status $Y^i_t$ for $t \neq V^i_j$,
then CAR(DYN) does not hold.
8 Conclusion



We have proposed a general coarsening model for processes (GCMP) and 
developed a theory of ignorability in this framework. The theory applies 
to general stochastic processes having discrete or continuous state-space; in 
particular it applies to both counting processes and diffusion processes. The 
framework of repeated measurements can be represented as a continuous- 
time continuous state-space process observed at discrete times. Our results 
hold even if the observed part of the process of interest $X$ has a null
probability, which is the case in the examples of the previous section. We have
given a factorization condition for the likelihood which allows the nuisance
parameter to be eliminated and can be called weak ignorability; we can define
ignorability in a strong sense, that is, equality between the correct likelihood and
the likelihood ignoring the m.l.i.d., on events $\{R = r\}$ of non-null probability.
This restriction comes from the fact that likelihoods, as Radon-Nikodym
derivatives, are not uniquely defined.

We have applied the results presented here to a multi-state model for de- 
mentia, institutionalization and death (Commenges and Gegout-Petit, 2005). 
This model can be represented by a three-variate counting process to which 
we associate the three-variate response process. In this application the ob- 
served event of X is generally of null probability although it is not a single- 
ton; institutionalization can be observed either exactly or in an interval; we 
showed that the m.l.i.d. could be ignorable. 